feat: build & test relayflows core against @agent-relay 8.x by willwashburn · Pull Request #2 · AgentWorkforce/relayflows

willwashburn · 2026-06-04T16:44:50Z

User description

Summary

Brings packages/core to a compiling, building state for the first time in this repo (it was imported from the relay monorepo mid-migration and never built here), migrates it off the removed v7 @agent-relay/sdk broker API onto @agent-relay/harness-driver 8.2, and rebrands the primitives to the @relayflows scope.

Depends on @agent-relay/harness-driver@8.2.0 (the SpawnedAgentHandle lifecycle primitive, added in AgentWorkforce/relay#1050).

Primitives

Rename @agent-relay/{github,slack,browser}-primitive → @relayflows/*, aligned to the 0.1.0 lineage. Updates core deps, imports, README, and the publish workflow's release notes + build order.

Core SDK migration (v7 → v8)

Broker surface swapped from the removed v7 AgentRelay to HarnessDriverClient: construction via HarnessDriverClient.spawn({ onStderr }); the 7 legacy named listeners rewritten as a single onEvent(BrokerEvent) dispatcher (the named event bus isn't fed in direct-client usage); spawnPty, listAgents/release, shutdown; idle-nudge via sendMessage.
Consumes the new SpawnedAgentHandle (waitForExit/waitForIdle/exitCode) through a thin local WorkflowAgentHandle wrapper that preserves the runner's existing string contract (so waitForExitWithIdleNudging is unchanged).
Vendored the deleted CLI utilities (getCliDefinition/resolveCliSync/resolveSpawnPolicy) into core/src/cli-registry.ts; stripAnsi → strip-ansi.
Bumped @agent-relay/{cloud,config,sdk,harness-driver,harnesses} to ^8.2.0; unified the repo on one SDK major (dropped browser-primitive's vestigial 7.1.1 pin).

✅ core + cli typecheck clean and build.

Tests — ⚠️ in progress (why this is a draft)

✅ Repaired monorepo-era import paths (../workflows/*, ../../provisioner/*) and the builtin-templates path (yaml-validation: 0 → 143 passing).
✅ Established + applied the HarnessDriverClient mock pattern: workflow-runner.test.ts 0 → 59/65, via a BrokerEvent-translating emitMockEvent shim + fail-fast reconciliation (the runner now defaults to strategy:'retry', so first-pass failure tests opt into fail-fast).
🔧 Remaining: migrate the same mock shape across ~11 more broker-mock test files (they currently spawn real brokers), and reconcile the supervised/owner/review behavioral assertions to the current runner. These were broken on extraction from the monorepo, independent of the SDK change.

See docs/sdk-v8-migration-plan.md for the full plan and findings.

🤖 Generated with Claude Code

Summary by cubic

Migrated packages/core from v7 @agent-relay/sdk to @agent-relay/harness-driver@8.2, rebranded primitives to @relayflows/*, and brought the core test suite to green.

Refactors
- Swapped v7 broker API for HarnessDriverClient with a single onEvent(BrokerEvent); preserved runner behavior via a WorkflowAgentHandle adapter over SpawnedAgentHandle.
- Vendored CLI helpers (getCliDefinition, resolveCliSync, resolveSpawnPolicy) into core/src/cli-registry.ts; switched to strip-ansi. Updated integrations to @relayflows/* and publish order to build primitives first.
- Minor cleanup: ignore **/.agent-relay/ in VCS and validate script file extensions before existence checks.
Tests
- Repaired and migrated the suite to the new driver: mock HarnessDriverClient, use the runner’s executor seam to avoid real spawns, and reconcile to { reason } results and fail-fast behavior. Full suite passing.
- Added vitest configs with a plugin to externalize the node:sqlite builtin and capped fork concurrency; updated trajectory path expectations and permission-audit path via @agent-relay/cloud’s getDefaultPermissionAuditPath.

^{Written for commit 4c32182. Summary will update on new commits.}

CodeAnt-AI Description

Move core to the v8 agent runtime and rename the primitives to @relayflows

What Changed

Core now uses the v8 harness driver for agent spawning, lifecycle tracking, idle nudges, and broker events.
The workflow runner keeps the same user-facing behavior while handling the new agent handle format behind the scenes.
GitHub, Slack, and Browser primitives were renamed to the @relayflows/* package scope, and the publish workflow now builds them before core.
Workflow tests were updated to match the new runtime behavior, including fail-fast cases, review decisions, idle nudges, and trajectory file locations.

Impact

✅ Workflows run against the v8 agent runtime
✅ Fewer breakages from renamed packages
✅ Clearer workflow test coverage for retry, review, and idle behavior

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

Brings packages/core to a compiling, building state for the first time in this repo (it was imported mid-migration and never built here), and rebrands the primitives to the @relayflows scope. Primitives: - Rename @agent-relay/{github,slack,browser}-primitive -> @relayflows/*, aligned to the 0.1.0 lineage; update core deps, imports, README, and the publish workflow's release notes + build order. Core SDK migration (v7 -> v8): - Swap the broker surface from the v7 `AgentRelay` (removed API) to `@agent-relay/harness-driver`'s `HarnessDriverClient`: construction via `HarnessDriverClient.spawn({ onStderr })`, the 7 legacy named listeners rewritten as a single `onEvent(BrokerEvent)` dispatch, `spawnPty`, `listAgents`/`release`, `shutdown`, and idle-nudge via `sendMessage`. - Consume the new `SpawnedAgentHandle` (waitForExit/waitForIdle/exitCode, added upstream in harness-driver 8.2) through a thin local WorkflowAgentHandle wrapper that preserves the runner's string contract. - Vendor the deleted CLI utilities (getCliDefinition/resolveCliSync/ resolveSpawnPolicy) into core/src/cli-registry.ts; stripAnsi -> strip-ansi. - Bump @agent-relay/{cloud,config,sdk,harness-driver,harnesses} to ^8.2.0 and unify the repo on a single SDK major (drop browser-primitive's vestigial 7.1.1 pin). core + cli typecheck clean and build. Tests (in progress): repaired monorepo-era import paths (../workflows/* and ../../provisioner/*), fixed the builtin-templates path (yaml-validation: 0 -> 143), and migrated workflow-runner.test.ts to the HarnessDriverClient mock with fail-fast reconciliation (now 59/65). Remaining broker-mock files and supervised-flow behavioral reconciliation are still to do. See docs/sdk-v8-migration-plan.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

gemini-code-assist · 2026-06-04T16:44:53Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-06-04T16:44:58Z

📝 Walkthrough

Walkthrough

This PR migrates relayflows from @agent-relay/sdk v7 to v8, rebranding primitive packages to @relayflows/*, introducing a WorkflowAgentHandle adapter, vendoring CLI utilities locally, and refactoring the core runner for harness-driver broker integration with comprehensive test rewiring.

Changes

Core SDK v8 migration and architecture update

Layer / File(s)	Summary
Package metadata updates and dependency alignment `packages/browser-primitive/package.json`, `packages/core/package.json`, `packages/github-primitive/package.json`, `packages/slack-primitive/package.json`	Primitive packages renamed to `@relayflows/` at v0.1.0 (from `@agent-relay/` v7.1.1); core dependencies upgraded to `@agent-relay/*` v8.2.0 with new `@agent-relay/harness-driver` addition.
WorkflowAgentHandle adapter for harness-driver compatibility `packages/core/src/agent-handle.ts`	New `WorkflowAgentHandle` wrapper adapts v8 `SpawnedAgentHandle` lifecycle results (`{ reason }` objects) to runner's string contract (`'exited' \| 'idle' \| 'timeout'`).
CLI utility vendoring into local registry `packages/core/src/cli-registry.ts`, `packages/core/src/process-spawner.ts`, `packages/core/src/proxy-env.ts`, `packages/core/src/channel-messenger.ts`	Consolidated `cli-registry.ts` provides CLI definitions, resolution (async/sync with memoization), and spawn-policy construction; updated consumers to import locally instead of from SDK; `process-spawner` gains onChunk callback and early-completion handling.
Builder and run option type updates `packages/core/src/builder.ts`, `packages/core/src/run.ts`	`WorkflowRunOptions.relay` retyped from `AgentRelayOptions` to `RuntimeSpawnOptions` to align with v8 broker API.
Core runner broker and agent lifecycle migration `packages/core/src/runner.ts`	Major refactor from `AgentRelay` listener hooks to unified `HarnessDriverClient.onEvent` dispatcher; `spawnPty` constructs `WorkflowAgentHandle`; stale-agent cleanup via `.release(name)`; improved owner-timeout precedence; evidence-based review decision parsing; idle nudging via `.sendMessage()`.
Test infrastructure and mock rewiring `packages/core/src/__tests__/*.test.ts`	Harness-driver mocking with `makeMockHandle()` PTY helpers; centralized `eventListeners` registry bridging legacy event names to broker event shapes; test paths aligned to new module structure; event isolation via `beforeEach` cleanup; many tests set `errorHandling.strategy: 'fail-fast'` for deterministic behavior.
Integration imports update to new primitive scopes `packages/core/src/integrations/browser.ts`, `packages/core/src/integrations/github.ts`, `packages/core/src/integrations/slack.ts`	Browser, GitHub, and Slack integrations updated to import from `@relayflows/*` scoped packages.
Documentation, workflow, and configuration updates `.github/workflows/publish.yml`, `.gitignore`, `docs/sdk-v8-migration-plan.md`, `packages/core/vitest.config.ts`, `packages/github-primitive/README.md`	Migration plan added with workstreams (WS-0 to WS-5) and effort estimate; release workflow reordered to build primitives first and release notes updated; `.gitignore` extended for npm cache and runtime directories; vitest configuration and node:sqlite plugin added; GitHub primitive README updated.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

"🐰 In SDK eight, our runner takes flight,
With harness drivers shining bright,
Primitives rebranded with relayflows' name,
Agent handles adapted—same dance, different frame!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 43.90% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat: build & test relayflows core against `@agent-relay` 8.x' accurately reflects the main objective of migrating core to v8 and establishing a compiling/testable state.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The PR description comprehensively describes the migration from v7 `@agent-relay/sdk` to v8 `@agent-relay/harness-driver`, rebranding of primitives, vendor changes, and test updates.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/sdk-v8-migration

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-06-04T16:48:13Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dda4aafc48

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-04T16:52:02Z

+      this.log(`[${step.name}] Spawning ${agentDef.cli} (pty)`);
+      agent = new WorkflowAgentHandle(
+        await this.relay.spawnPty({
          ...(spawnOptions as Record<string, unknown>),
          cli: agentDef.cli,
-        } as Parameters<AgentRelay['spawnPty']>[0]);
-      }
+        } as SpawnPtyInput)


Preserve headless spawning for OpenCode agents

When agentDef.cli is opencode, this path now always calls spawnPty and even logs (pty), whereas the previous SDK-spawner path requested the OpenCode headless runtime. I checked the Agent Relay TypeScript SDK reference, which says to use runtime: 'headless' for app-server sessions such as OpenCode; with this change any workflow using an OpenCode agent is launched through the PTY transport instead of the supported headless transport, so those steps can fail or hang rather than execute normally.

Useful? React with 👍 / 👎.

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

packages/core/src/runner.ts (2)
6789-6803: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Hub nudge lookup is using logical names against a runtime-name map.

activeAgentHandles is populated with spawned runtime names, but resolveHubForNudge() probes it with agentDef.name from config. In supervised flows those names differ (<step>-owner-<runId>, etc.), so hub-mediated nudges never find the live coordinator and always fall back to the direct path. Resolve this via the logical-name metadata in runtimeStepAgents/supervisedRuntimeAgents, or search the live handles by logical agent identity instead of map key.

Also applies to: 6845-6845, 7219-7243
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/runner.ts` around lines 6789 - 6803, The hub nudge lookup
fails because resolveHubForNudge() probes activeAgentHandles by config
agentDef.name (logical name) while activeAgentHandles keys are spawned runtime
names; update resolveHubForNudge (and its call sites around
resolveSignalParticipantKind/rememberStepSignalSender) to resolve runtime handle
by using the logical-name metadata stored in runtimeStepAgents and
supervisedRuntimeAgents (or fall back to scanning activeAgentHandles values for
matching logicalName/options.logicalName) and then use that runtime name to
query activeAgentHandles; ensure calls that currently pass agentDef.name pass
the logical identifier and the resolver performs the reverse lookup to return
the correct runtime handle for hub-mediated nudges.
6070-6076: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only finalize review early on an explicit REVIEW_DECISION line.

onChunk() now runs the full heuristic parser on partial output and immediately releases the reviewer handle when it sees a match. That makes the new evidence fallback fail-open: a chunk like looks good so far... or an echoed safe handoff phrase can be locked in as approved before the final verdict arrives. Keep the streaming path strict (REVIEW_DECISION: only), and apply the evidence/indecision fallbacks only after the reviewer exits.
Suggested direction
         onChunk: ({ chunk }) => {
           const nextOutput = reviewOutput + WorkflowRunner.stripAnsi(chunk);
           reviewOutput = nextOutput;
-          const parsed = this.parseReviewDecision(nextOutput);
+          const parsed = this.parseStrictReviewDecision(nextOutput);
           if (parsed) {
             startReviewCompletion(parsed);
           }
         },
Also applies to: 6121-6129, 6279-6313
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/runner.ts` around lines 6070 - 6076, The stream handler
currently runs the full heuristic parser on every partial chunk (in onChunk) and
may call startReviewCompletion too early; change onChunk so it only triggers
parseReviewDecision/startReviewCompletion when the chunk contains an explicit
REVIEW_DECISION: token (e.g., test for the literal prefix or regex like
/^\s*REVIEW_DECISION:/ before calling this.parseReviewDecision), leaving
heuristic/evidence and indecision fallbacks to run only after the reviewer
process exits; update the analogous handlers that call
this.parseReviewDecision/startReviewCompletion (the other occurrences noted near
the blocks at 6121-6129 and 6279-6313) to follow the same rule.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/sdk-v8-migration-plan.md`:
- Around line 94-101: The docs sections (WS-1/WS-4) still describe the old
AgentRelay/messaging APIs and conflict with the implemented runner direction;
update these sections to either (a) replace all references to AgentRelay option
changes with the actual HarnessDriverClient lifecycle APIs (and update mentions
in builder.ts, run.ts, runner.ts references) or (b) clearly mark the AgentRelay
text as a historical draft and add a short “historical vs final approach” note
that points to HarnessDriverClient; specifically revise the wording around
AgentRelay, AgentRelayOptions, and where env/brokerName/requestTimeoutMs map
(now handled via HarnessDriverClient/SpawnRuntimeInput, broker transport config,
and retryPolicy) and ensure the cross-references in builder.ts, run.ts, and
runner.ts are consistent with the final approach.
- Around line 41-43: The event-surface guidance is wrong: update the table
entries that show session-style named listeners (e.g., the
`addListener('workerOutput'|'messageReceived'|...)` row) to state that direct
clients consume a unified BrokerEvent via `onEvent(BrokerEvent)` instead of
remapped session listeners; also clarify that `addListener<K extends keyof
HarnessDriverEvents>(event, handler)` remains for driver-side hooks but is not
the client-facing model, and change the `onBrokerStderr(cb)` note to reflect the
`onStderr` construction option; apply the same corrections to the duplicated
section later in the doc where the session listener mapping is repeated.
- Around line 3-5: Update the migration plan's metadata to reflect the merged
v8.2 state: change the status text on the current "Status: proposed, not
started." line to indicate the migration is complete (e.g., "Status: merged /
completed") and update all version references currently showing "8.0.4" and
"8.1.2" to the current dependency range "^8.2.0"; apply the same edits to the
other occurrences referenced (including the block around lines 167-181) so the
document consistently reflects core being migrated and relay deps at ^8.2.0.

In `@packages/core/src/__tests__/resume-fallback.test.ts`:
- Around line 118-119: The test currently mocks legacy ../relay.js but
WorkflowRunner initializes this.relay via HarnessDriverClient.spawn (so broker
events use relay.onEvent/connectEvents) and requires a broker for agent steps
during runner.resume; replace the vi.mock('../relay.js', ...) with
vi.mock('`@agent-relay/harness-driver`', ...) and provide a mock object matching
the HarnessDriverClient surface used by the runner: export a spawn that returns
an object with onEvent, connectEvents, spawnPty (and any minimal behavior needed
for tests), ensuring WorkflowRunner (created by importing ../runner.js) gets
this mocked HarnessDriverClient so requiresBroker and relay event dispatch work
in the test.

In `@packages/core/src/__tests__/start-from.test.ts`:
- Around line 121-122: The test currently mocks the legacy relay API but not the
new harness-driver contract used by WorkflowRunner; update start-from.test.ts to
mock '`@agent-relay/harness-driver`' so that HarnessDriverClient.spawn(...) is
replaced with a test stub that returns an object matching the broker used by
WorkflowRunner (i.e., implement spawn to return an object with an onEvent method
and any other methods WorkflowRunner expects), remove or stop relying on legacy
../relay.js/addListener mocks, and ensure the test triggers relay events by
calling the stubbed broker.onEvent handlers so the start-from path exercises the
real broker lifecycle APIs used by WorkflowRunner and this.relay.onEvent.

In `@packages/core/src/__tests__/workflow-runner.test.ts`:
- Around line 72-84: The tests mock for spawnPty currently returns bare strings
from waitForExit/waitForIdle but WorkflowAgentHandle expects an object shape {
reason }; update the mock so waitForExit and waitForIdle resolve to an object
with the reason field (e.g., { reason: <value> }) — adjust the spawnPty mock or
replace its return with the existing makeMockHandle(name) factory (which exposes
waitForExit/waitForIdle that resolve to { reason }) or wrap
waitForExitFn/waitForIdleFn results before returning so all callers receive {
reason } instead of a raw string.

In `@packages/core/src/runner.ts`:
- Around line 4342-4346: The getFailureResult() branches are unconditionally
printing error.stack when the message matches /reading 'get'/ which leaks
stacks; remove the console.error('DEBUG_STACK>>>', error.stack) calls inside
those inline error serialization blocks (the getFailureResult implementations)
and instead rely on the existing RF_DEBUG_STACK gated debug logging used in the
surrounding catch handlers; if you need to preserve stack output here, guard it
with the same env check (process.env.RF_DEBUG_STACK) or call the centralized
debug logger rather than printing unconditionally. Also remove the identical
unconditional stack dump in the second failure-serialization branch referenced
in the comment (the other getFailureResult-like block) so both locations use the
env-gated debug path.

---

Outside diff comments:
In `@packages/core/src/runner.ts`:
- Around line 6789-6803: The hub nudge lookup fails because resolveHubForNudge()
probes activeAgentHandles by config agentDef.name (logical name) while
activeAgentHandles keys are spawned runtime names; update resolveHubForNudge
(and its call sites around
resolveSignalParticipantKind/rememberStepSignalSender) to resolve runtime handle
by using the logical-name metadata stored in runtimeStepAgents and
supervisedRuntimeAgents (or fall back to scanning activeAgentHandles values for
matching logicalName/options.logicalName) and then use that runtime name to
query activeAgentHandles; ensure calls that currently pass agentDef.name pass
the logical identifier and the resolver performs the reverse lookup to return
the correct runtime handle for hub-mediated nudges.
- Around line 6070-6076: The stream handler currently runs the full heuristic
parser on every partial chunk (in onChunk) and may call startReviewCompletion
too early; change onChunk so it only triggers
parseReviewDecision/startReviewCompletion when the chunk contains an explicit
REVIEW_DECISION: token (e.g., test for the literal prefix or regex like
/^\s*REVIEW_DECISION:/ before calling this.parseReviewDecision), leaving
heuristic/evidence and indecision fallbacks to run only after the reviewer
process exits; update the analogous handlers that call
this.parseReviewDecision/startReviewCompletion (the other occurrences noted near
the blocks at 6121-6129 and 6279-6313) to follow the same rule.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: de9e1424-4786-4a36-bca2-33414ec0b439

📥 Commits

Reviewing files that changed from the base of the PR and between ab6bab3 and dda4aaf.

⛔ Files ignored due to path filters (1)

package-lock.json is excluded by !**/package-lock.json

📒 Files selected for processing (31)

.github/workflows/publish.yml
.gitignore
docs/sdk-v8-migration-plan.md
packages/browser-primitive/package.json
packages/core/package.json
packages/core/src/__tests__/completion-pipeline.test.ts
packages/core/src/__tests__/e2e-owner-review.test.ts
packages/core/src/__tests__/e2e-permissions.test.ts
packages/core/src/__tests__/error-scenarios.test.ts
packages/core/src/__tests__/idle-nudge.test.ts
packages/core/src/__tests__/permission-types.test.ts
packages/core/src/__tests__/permissions-integration.test.ts
packages/core/src/__tests__/resume-fallback.test.ts
packages/core/src/__tests__/start-from.test.ts
packages/core/src/__tests__/workflow-runner.test.ts
packages/core/src/__tests__/yaml-validation.test.ts
packages/core/src/agent-handle.ts
packages/core/src/builder.ts
packages/core/src/channel-messenger.ts
packages/core/src/cli-registry.ts
packages/core/src/integrations/browser.ts
packages/core/src/integrations/github.ts
packages/core/src/integrations/slack.ts
packages/core/src/process-spawner.ts
packages/core/src/proxy-env.ts
packages/core/src/run.ts
packages/core/src/runner.ts
packages/core/vitest.config.ts
packages/github-primitive/README.md
packages/github-primitive/package.json
packages/slack-primitive/package.json

coderabbitai · 2026-06-04T16:59:21Z

+> Status: **proposed, not started.** This plan was produced from a read-only
+> investigation of `packages/core` and the relay monorepo HEAD (all `@agent-relay/*`
+> at `8.0.4` source / `8.1.2` npm). Review before any code is written.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update plan status/version references to reflect the merged v8.2 migration state.

Line 3 (“proposed, not started”) and Line 5 (8.0.4/8.1.2) now conflict with the current repo state (core already migrated and dependencies at ^8.2.0). This makes the plan misleading for follow-up work and backport decisions.

Also applies to: 167-181

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/sdk-v8-migration-plan.md` around lines 3 - 5, Update the migration plan's metadata to reflect the merged v8.2 state: change the status text on the current "Status: proposed, not started." line to indicate the migration is complete (e.g., "Status: merged / completed") and update all version references currently showing "8.0.4" and "8.1.2" to the current dependency range "^8.2.0"; apply the same edits to the other occurrences referenced (including the block around lines 167-181) so the document consistently reflects core being migrated and relay deps at ^8.2.0.

coderabbitai · 2026-06-04T16:59:21Z

+| `addListener('workerOutput'\|'messageReceived'\|'agentSpawned'\|'agentReleased'\|'agentExited'\|'agentIdle'\|'deliveryUpdate', …)` | same `addListener<K extends keyof HarnessDriverEvents>(event, handler)` — **same event names** | ✅ verbatim |
+| `workerOutput {name, chunk}` / `messageReceived {eventId,from,to,text,threadId}` / `agentIdle {name,idleSecs}` payloads | `WorkerOutputPayload` / `DriverMessage` / `AgentIdlePayload` — **same field names** | ✅ verbatim |
+| `onBrokerStderr(cb)` | `onStderr` **construction option** | ✅ moved to ctor |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Correct the event-surface guidance: direct clients consume BrokerEvent via onEvent, not the named/session listener mapping shown here.

The current runner integration is built around onEvent(BrokerEvent) dispatch (not the session-style message.created/terminal.output remap in this table). Please update this section to avoid steering future migrations toward a non-working listener model.

Also applies to: 121-133

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/sdk-v8-migration-plan.md` around lines 41 - 43, The event-surface guidance is wrong: update the table entries that show session-style named listeners (e.g., the `addListener('workerOutput'|'messageReceived'|...)` row) to state that direct clients consume a unified BrokerEvent via `onEvent(BrokerEvent)` instead of remapped session listeners; also clarify that `addListener<K extends keyof HarnessDriverEvents>(event, handler)` remains for driver-side hooks but is not the client-facing model, and change the `onBrokerStderr(cb)` note to reflect the `onStderr` construction option; apply the same corrections to the duplicated section later in the doc where the session listener mapping is repeated.

coderabbitai · 2026-06-04T16:59:21Z

+### WS-1 — AgentRelay construction & options (`runner.ts:3145`, types in `builder.ts:4`, `run.ts:1`)
+- v7 `new AgentRelay({ brokerName, channels, env, requestTimeoutMs })` →
+  v8 `new AgentRelay({ workspaceKey, baseUrl, retryPolicy, harness })`.
+- `env` / `brokerName` / `requestTimeoutMs` are gone from options. Decide where
+  each goes: `env` → harness-driver `SpawnRuntimeInput`; `brokerName` → broker
+  transport config; `requestTimeoutMs` → `retryPolicy`.
+- Update the `AgentRelayOptions` type references in `builder.ts`, `run.ts`,
+  `runner.ts:481`.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

WS-1/WS-4 still describe superseded AgentRelay/messaging APIs; align these steps to HarnessDriverClient lifecycle APIs or mark them as historical.

As written, these sections conflict with the implemented runner direction and can cause incorrect follow-on edits in runner.ts/tests. A brief “historical draft vs final approach” note would prevent confusion.

Also applies to: 140-144

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/sdk-v8-migration-plan.md` around lines 94 - 101, The docs sections (WS-1/WS-4) still describe the old AgentRelay/messaging APIs and conflict with the implemented runner direction; update these sections to either (a) replace all references to AgentRelay option changes with the actual HarnessDriverClient lifecycle APIs (and update mentions in builder.ts, run.ts, runner.ts references) or (b) clearly mark the AgentRelay text as a historical draft and add a short “historical vs final approach” note that points to HarnessDriverClient; specifically revise the wording around AgentRelay, AgentRelayOptions, and where env/brokerName/requestTimeoutMs map (now handled via HarnessDriverClient/SpawnRuntimeInput, broker transport config, and retryPolicy) and ensure the cross-references in builder.ts, run.ts, and runner.ts are consistent with the final approach.

coderabbitai · 2026-06-04T16:59:21Z

+const { WorkflowRunner } = await import('../runner.js');
+const { workflow } = await import('../builder.js');


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check runtime contract vs current test mock style. rg -n "HarnessDriverClient|onEvent\\(" packages/core/src/runner.ts -C2 rg -n "vi\\.mock\\('../relay\\.js'|addListener\\(|vi\\.mock\\('`@agent-relay/harness-driver`'|onEvent\\()" \ packages/core/src/__tests__/start-from.test.ts -C2

Repository: AgentWorkforce/relayflows

Length of output: 1456

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Confirm runner.ts uses harness-driver client + onEvent broker events rg -n "import \\{ HarnessDriverClient \\} from '`@agent-relay/harness-driver`'|\\.onEvent\\(" packages/core/src/runner.ts -C2 # 2) Inspect start-from.test.ts around the shown import lines sed -n '90,170p' packages/core/src/__tests__/start-from.test.ts | nl -ba | sed -n '1,120p' # 3) Find mocks/doubles in start-from.test.ts for relay.js/addListener/harness-driver usage rg -n "vi\\.mock\\(|\\baddListener\\b|`@agent-relay/harness-driver`|onEvent\\(" packages/core/src/__tests__/start-from.test.ts -C2

Repository: AgentWorkforce/relayflows

Length of output: 802

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="packages/core/src/__tests__/start-from.test.ts" # 1) Show the exact area around the reported dynamic imports (approx.) sed -n '105,145p' "$FILE" | cat -n # 2) Look for relay.js / addListener mocks in the test (separately, to avoid regex errors) rg -n "vi\\.mock" "$FILE" rg -n "addListener" "$FILE" rg -n "relay\\.js" "$FILE" rg -n "`@agent-relay/harness-driver`" "$FILE" # 3) Confirm runner mock usage style via onEvent / HarnessDriverClient references in the test rg -n "onEvent\\(" "$FILE" rg -n "HarnessDriverClient" "$FILE"

Repository: AgentWorkforce/relayflows

Length of output: 1924

🏁 Script executed:

#!/bin/bash set -euo pipefail # Inspect runner.ts dependencies/migration status rg -n "from '\\.\\/relay\\.js'|from \"\\.\\/relay\\.js\"|\\bAgentRelay\\b|addListener" packages/core/src/runner.ts rg -n "HarnessDriverClient|onEvent\\(" packages/core/src/runner.ts # Inspect the start-from test mocks and whether it references onEvent/harness-driver FILE="packages/core/src/__tests__/start-from.test.ts" rg -n "HarnessDriverClient|`@agent-relay/harness-driver`|onEvent\\(" "$FILE" sed -n '60,140p' "$FILE" | cat -n sed -n '140,240p' "$FILE" | cat -n

Repository: AgentWorkforce/relayflows

Length of output: 212

🏁 Script executed:

#!/bin/bash set -euo pipefail # runner.ts: look for legacy relay.js/AgentRelay/addListener references rg -n "relay\\.js" packages/core/src/runner.ts rg -n "AgentRelay" packages/core/src/runner.ts rg -n "addListener" packages/core/src/runner.ts # runner.ts: harness-driver contract references rg -n "`@agent-relay/harness-driver`|HarnessDriverClient|onEvent\\(" packages/core/src/runner.ts # start-from.test.ts: harness-driver/onEvent references FILE="packages/core/src/__tests__/start-from.test.ts" rg -n "`@agent-relay/harness-driver`|HarnessDriverClient|onEvent\\(" "$FILE"

Repository: AgentWorkforce/relayflows

Length of output: 51

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="packages/core/src/runner.ts" # Show runner.ts imports area (first ~80 lines) sed -n '1,120p' "$FILE" | cat -n # Search runner.ts for relay.js / relay identifiers in simpler way rg -n "relay" "$FILE" rg -n "from '\\./relay'|from '\\./relay\\.js'|from \"\\./relay\"|from \"\\./relay\\.js\"|relay\\.js" "$FILE" || true # Find where WorkflowRunner is exported/defined rg -n "export (class|function|const) WorkflowRunner|class WorkflowRunner|WorkflowRunner" "$FILE" | head -n 50

Repository: AgentWorkforce/relayflows

Length of output: 14764

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) See if runner.ts imports the relay module that the test mocks (../relay.js => src/relay.js) FILE="packages/core/src/runner.ts" rg -n "from '\\.\\/relay(\\.|\\/|')|from \"\\.\\/relay(\\.|\\/|')|from '\\.\\/relay\\.js'|from \"\\.\\/relay\\.js'|from '\\.\\/relay\\.js'|from \"\\.\\/relay\\.js\"|relay\\.js" "$FILE" || true rg -n "\\baddListener\\b|listAgentsRaw|AgentRelay|AgentRelayClient" "$FILE" || true # 2) Inspect the test's relay mock implementation to understand what contract it provides TEST="packages/core/src/__tests__/start-from.test.ts" rg -n "const mockRelayInstance|vi\\.mock\$'\\.\\./relay\\.js'\$|vi\\.mock\$\"\\.\\./relay\\.js\"\$|addListener|listAgentsRaw|emitRelayEvent" "$TEST" sed -n '1,140p' "$TEST" | cat -n # 3) Inspect the mocked relay module itself (src/relay.*) for whether it still exposes addListener # Locate relay files find packages/core/src -maxdepth 1 -type f -name "relay.*" -o -name "*relay*" | head -n 50

Repository: AgentWorkforce/relayflows

Length of output: 6426

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Find any import/require of relay layer from runner.ts FILE="packages/core/src/runner.ts" rg -n "from '\\./relay|from \"\\./relay|require\\(|\\./relay\\.js|relay\\.js" "$FILE" || true rg -n "import\\s+\\{[^}]*\\}\\s+from\\s+'\\./relay" "$FILE" || true rg -n "AgentRelay" "$FILE" || true rg -n "addListener" "$FILE" || true # 2) Inspect runner.ts region where HarnessDriverClient.spawn + onEvent are used rg -n "HarnessDriverClient\\.spawn\\(|this\\.relay\\.onEvent\\(" "$FILE" # show around those lines start=3100 end=3180 sed -n "${start},${end}p" "$FILE" | cat -n # 3) Also check runner.ts for any "relay.js" direct usage rg -n "new\\s+AgentRelay|AgentRelay\\(" "$FILE" || true

Repository: AgentWorkforce/relayflows

Length of output: 5105

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="packages/core/src/__tests__/start-from.test.ts" # Find where workflow() is defined/used rg -n "workflow\\(" "$FILE" rg -n "new WorkflowRunner|WorkflowRunner\\b" "$FILE" rg -n "startFrom|startFromStep|start-from|startFrom" "$FILE" rg -n "AGENT_RELAY_WORKFLOW_DISABLE_RELAYCAST|requiresBroker|relaycastDisabled" "$FILE" # Show larger windows around the first workflow() usage and runner invocation. # (Use sed ranges based on first matches.) FIRST_WORKFLOW_LINE=$(rg -n "workflow\\(" "$FILE" | head -n 1 | cut -d: -f1) FIRST_RUNNER_LINE=$(rg -n "new WorkflowRunner|WorkflowRunner\\b" "$FILE" | head -n 1 | cut -d: -f1 || true) echo "First workflow line: ${FIRST_WORKFLOW_LINE:-N/A}" echo "First runner line: ${FIRST_RUNNER_LINE:-N/A}" if [ -n "${FIRST_WORKFLOW_LINE:-}" ]; then start=$((FIRST_WORKFLOW_LINE-60)); end=$((FIRST_WORKFLOW_LINE+220)) sed -n "${start},${end}p" "$FILE" | cat -n fi # Also show where tests configure step types (search for "type:" in workflow definition) rg -n "type:\\s*'|type:\\s*\"" "$FILE" | head -n 80

Repository: AgentWorkforce/relayflows

Length of output: 1655

🏁 Script executed:

#!/bin/bash set -euo pipefail TEST="packages/core/src/__tests__/start-from.test.ts" RUNNER="packages/core/src/runner.ts" # Show workflow config definition around the reported line (~74) and the execute calls around runner.execute. sed -n '140,240p' "$TEST" | cat -n sed -n '240,380p' "$TEST" | cat -n # Locate the "workflow('test')" config and show its content. # (First show ~100 lines around that occurrence.) LINE=$(rg -n "const config = workflow\\(" "$TEST" | head -n 1 | cut -d: -f1) start=$((LINE-80)); end=$((LINE+200)) sed -n "${start},${end}p" "$TEST" | cat -n # Inspect runner.ts for how `this.executor` is set/used near requiresBroker. rg -n "this\\.executor" "$RUNNER" # Show the block around the requiresBroker computation we already saw (~3100-3180), plus earlier few lines. sed -n '3000,3125p' "$RUNNER" | cat -n sed -n '3125,3205p' "$RUNNER" | cat -n

Repository: AgentWorkforce/relayflows

Length of output: 27303

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Confirm whether start-from.test.ts mocks harness-driver at all FILE="packages/core/src/__tests__/start-from.test.ts" rg -n "`@agent-relay/harness-driver`|HarnessDriverClient|spawnPty\$|\\.onEvent\\(" "$FILE" || true # 2) Inspect how workflow steps are typed when config steps omit explicit `type` # Search for step parsing/typing logic around `deterministic`/`worktree`/`integration` rg -n "deterministic'|worktree'|integration'|step\\.type|type:\\s*['\"]deterministic['\"]" packages/core/src | head -n 80 # Narrow: search for where step objects are normalized/validated rg -n "normalize.*step|resolve.*step|validate.*step|infer.*step|apply.*step" packages/core/src | head -n 80 # 3) Find where RelayYamlConfig.workflows.steps are converted into internal WorkflowStep objects with `type` rg -n "workflows\\b|steps\\b.*type|type.*step" packages/core/src | head -n 80

Repository: AgentWorkforce/relayflows

Length of output: 24662

🏁 Script executed:

#!/bin/bash set -euo pipefail # Find where harness-driver is mocked anywhere in core tests rg -n "vi\\.mock\\(['\"]`@agent-relay/harness-driver`['\"]\$" packages/core/src/__tests__ || true rg -n "setup.*vitest|vitest\\.setup|setupFiles|testSetup|beforeAll\\(" packages/core/src/__tests__ packages/core/src || true # Inspect runner.ts around where it throws AgentRelay not initialized and any other relay.js usage rg -n "AgentRelay not initialized|AgentRelay" packages/core/src/runner.ts sed -n '6560,6725p' packages/core/src/runner.ts | cat -n # Inspect relay.js module itself for any legacy addListener API RELAY="packages/core/src/relay.ts" if [ -f "$RELAY" ]; then sed -n '1,220p' "$RELAY" | cat -n else RELAYJS="packages/core/src/relay.js" if [ -f "$RELAYJS" ]; then sed -n '1,220p' "$RELAYJS" | cat -n fi fi

Repository: AgentWorkforce/relayflows

Length of output: 7928

Fix start-from.test.ts relay mocking to match WorkflowRunner’s harness-driver contract

The test still mocks legacy ../relay.js/addListener behavior, but WorkflowRunner now creates the broker with HarnessDriverClient.spawn(...) and consumes broker events via this.relay.onEvent(...). There is no vi.mock('@agent-relay/harness-driver') in this test, so the start-from path isn’t exercising the same broker lifecycle APIs as production.

const { WorkflowRunner } = await import('../runner.js'); const { workflow } = await import('../builder.js');

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/__tests__/start-from.test.ts` around lines 121 - 122, The test currently mocks the legacy relay API but not the new harness-driver contract used by WorkflowRunner; update start-from.test.ts to mock '`@agent-relay/harness-driver`' so that HarnessDriverClient.spawn(...) is replaced with a test stub that returns an object matching the broker used by WorkflowRunner (i.e., implement spawn to return an object with an onEvent method and any other methods WorkflowRunner expects), remove or stop relying on legacy ../relay.js/addListener mocks, and ensure the test triggers relay events by calling the stubbed broker.onEvent handlers so the start-from path exercises the real broker lifecycle APIs used by WorkflowRunner and this.relay.onEvent.

coderabbitai · 2026-06-04T16:59:21Z

      getFailureResult: (error) => ({
        status: 'failed',
        output: '',
-        error: error instanceof Error ? error.message : String(error),
+        error: (() => { if (error instanceof Error && /reading 'get'/.test(error.message)) console.error('DEBUG_STACK>>>', error.stack); return error instanceof Error ? error.message : String(error); })(),
        retries: state.row.retryCount,


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the unconditional stack dump from failure serialization.

These two getFailureResult() branches print error.stack whenever the message matches reading 'get', regardless of RF_DEBUG_STACK. That leaks internal traces into normal runs and makes logging depend on error text. Reuse the existing env-gated debug logging in the surrounding catch paths instead.

Also applies to: 4408-4411

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/runner.ts` around lines 4342 - 4346, The getFailureResult() branches are unconditionally printing error.stack when the message matches /reading 'get'/ which leaks stacks; remove the console.error('DEBUG_STACK>>>', error.stack) calls inside those inline error serialization blocks (the getFailureResult implementations) and instead rely on the existing RF_DEBUG_STACK gated debug logging used in the surrounding catch handlers; if you need to preserve stack output here, guard it with the same env check (process.env.RF_DEBUG_STACK) or call the centralized debug logger rather than printing unconditionally. Also remove the identical unconditional stack dump in the second failure-serialization branch referenced in the comment (the other getFailureResult-like block) so both locations use the env-gated debug path.

cubic-dev-ai

6 issues found across 32 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/src/runner.ts">

<violation number="1" location="packages/core/src/runner.ts:4345">
P2: Leftover debug instrumentation logs stack traces from failure paths; remove the inline `console.error` side effect from error serialization.</violation>
</file>

<file name="docs/sdk-v8-migration-plan.md">

<violation number="1" location="docs/sdk-v8-migration-plan.md:3">
P3: Document status header says 'proposed, not started' but WS-0 is marked 'done & verified' and the PR description confirms WS-0 work (vendored CLI utils, strip-ansi rename) was completed. The header is misleading about actual progress.</violation>

<violation number="2" location="docs/sdk-v8-migration-plan.md:94">
P3: This workstream section still documents a migration path around `AgentRelay`, but the implementation has already moved to `HarnessDriverClient`. Update WS-1/WS-4 (or mark them historical) so follow-up edits don’t target removed APIs.</violation>

<violation number="3" location="docs/sdk-v8-migration-plan.md:126">
P2: WS-3 event mapping table targets the wrong API surface, contradicting Section 2's corrected finding. Section 2 establishes that the correct migration target is HarnessDriverClient.addListener with verbatim event names ('workerOutput', 'messageReceived', etc.), not @agent-relay/sdk messaging events ('message.created', 'terminal.output'). Following WS-3 as written would re-implement the large rewrite that Section 2 explicitly rejects.</violation>
</file>

_{Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.

Re-trigger cubic}

cubic-dev-ai · 2026-06-04T17:03:13Z

        status: 'failed',
        output: '',
-        error: error instanceof Error ? error.message : String(error),
+        error: (() => { if (error instanceof Error && /reading 'get'/.test(error.message)) console.error('DEBUG_STACK>>>', error.stack); return error instanceof Error ? error.message : String(error); })(),


P2: Leftover debug instrumentation logs stack traces from failure paths; remove the inline console.error side effect from error serialization.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/src/runner.ts, line 4345: <comment>Leftover debug instrumentation logs stack traces from failure paths; remove the inline `console.error` side effect from error serialization.</comment> <file context> @@ -4397,7 +4342,7 @@ export class WorkflowRunner { status: 'failed', output: '', - error: error instanceof Error ? error.message : String(error), + error: (() => { if (error instanceof Error && /reading 'get'/.test(error.message)) console.error('DEBUG_STACK>>>', error.stack); return error instanceof Error ? error.message : String(error); })(), retries: state.row.retryCount, exitCode: lastExitCode, </file context>

cubic-dev-ai · 2026-06-04T17:03:13Z

@@ -0,0 +1,181 @@
+# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan


P2: WS-3 event mapping table targets the wrong API surface, contradicting Section 2's corrected finding. Section 2 establishes that the correct migration target is HarnessDriverClient.addListener with verbatim event names ('workerOutput', 'messageReceived', etc.), not @agent-relay/sdk messaging events ('message.created', 'terminal.output'). Following WS-3 as written would re-implement the large rewrite that Section 2 explicitly rejects.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At docs/sdk-v8-migration-plan.md, line 126: <comment>WS-3 event mapping table targets the wrong API surface, contradicting Section 2's corrected finding. Section 2 establishes that the correct migration target is HarnessDriverClient.addListener with verbatim event names ('workerOutput', 'messageReceived', etc.), not @agent-relay/sdk messaging events ('message.created', 'terminal.output'). Following WS-3 as written would re-implement the large rewrite that Section 2 explicitly rejects.</comment> <file context> @@ -0,0 +1,181 @@ + +| v7 listener (fields) | v8 source | +|---|---| +| `workerOutput` `{name, chunk}` (3158) | session event `terminal.output`/`transcript.chunk` → `event.text`/`chunk` | +| `messageReceived` `{eventId,from,to,text,threadId}` (3206) | `addListener('message.created')` → `event.message.text`, `event.envelope.from/to`, `event.message.parentId` | +| `agentSpawned` `{name,runtime}` (3254) | session `status.active` / spawn ack from driver runtime | </file context>

cubic-dev-ai · 2026-06-04T17:03:13Z

@@ -0,0 +1,181 @@
+# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan
+
+> Status: **proposed, not started.** This plan was produced from a read-only


P3: Document status header says 'proposed, not started' but WS-0 is marked 'done & verified' and the PR description confirms WS-0 work (vendored CLI utils, strip-ansi rename) was completed. The header is misleading about actual progress.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At docs/sdk-v8-migration-plan.md, line 3: <comment>Document status header says 'proposed, not started' but WS-0 is marked 'done & verified' and the PR description confirms WS-0 work (vendored CLI utils, strip-ansi rename) was completed. The header is misleading about actual progress.</comment> <file context> @@ -0,0 +1,181 @@ +# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan + +> Status: **proposed, not started.** This plan was produced from a read-only +> investigation of `packages/core` and the relay monorepo HEAD (all `@agent-relay/*` +> at `8.0.4` source / `8.1.2` npm). Review before any code is written. </file context>

Suggested change

> Status: **proposed, not started.** This plan was produced from a read-only

> Status: **in progress.** WS-0 (utility vendoring) complete; remaining workstreams WS-1 through WS-5 still to be sequenced.

cubic-dev-ai · 2026-06-04T17:03:13Z

@@ -0,0 +1,181 @@
+# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan


P3: This workstream section still documents a migration path around AgentRelay, but the implementation has already moved to HarnessDriverClient. Update WS-1/WS-4 (or mark them historical) so follow-up edits don’t target removed APIs.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At docs/sdk-v8-migration-plan.md, line 94: <comment>This workstream section still documents a migration path around `AgentRelay`, but the implementation has already moved to `HarnessDriverClient`. Update WS-1/WS-4 (or mark them historical) so follow-up edits don’t target removed APIs.</comment> <file context> @@ -0,0 +1,181 @@ +- Move `BrokerEvent` / `AgentSpawner` type imports to `@agent-relay/harness-driver` + (`runner.ts:29,126`). + +### WS-1 — AgentRelay construction & options (`runner.ts:3145`, types in `builder.ts:4`, `run.ts:1`) +- v7 `new AgentRelay({ brokerName, channels, env, requestTimeoutMs })` → + v8 `new AgentRelay({ workspaceKey, baseUrl, retryPolicy, harness })`. </file context>

Reconcile the @relayflows/core test suite to the migrated runner and the bumped @agent-relay 8.x dependencies. Treats the runner/SDK as source of truth and updates test expectations to match. Test infrastructure: - Cap the vitest fork pool (maxForks: 4) so workflow tests that spawn child processes can't overwhelm the host. - Add a Vite plugin to externalize the `node:sqlite` builtin, which Vite's resolver does not yet recognize (collectors load it at import time). Eliminate real broker/PTY spawning in tests: - start-from / resume-fallback mocked stale module paths (`@relaycast/sdk`, `../relay.js`) that the migrated runner no longer imports, so execute() hit the real HarnessDriverClient and spawned live runtimes. Switch both to the runner's `executor` injection seam — no broker init, no spawn. - error-scenarios: drop a dead `vi.mock('@agent-relay/sdk/relay')` whose subpath was removed in sdk v8 (it failed the whole suite at load); the covered tests only exercise validation/early-reject paths. Reconcile to current runtime behavior: - Handle the SpawnedAgentHandle `{ reason }` contract in inline custom mock handles (raw reason strings destructured to undefined → undetected timeouts / hangs). - Opt failure-path supervised/owner-timeout tests into errorHandling.strategy:'fail-fast' so they exercise first-pass failures instead of the runner's auto-enabled repair/retry loop (was hanging to the 30s timeout). - agent-trajectories v0.6 moved the default data dir to `.agentworkforce/trajectories` (per-id subdirs); update trajectory readers. - @agent-relay/cloud v8 moved the permission-audit path and now grants write on all non-denied files under readwrite access; update expectations and convert provisioner-audit from node:test to vitest. - @relayfile/sdk v0.8 only removes mount points it created itself; let the SDK own the mount so stop() exercises cleanup. run-script.ts: validate file extension before existence so an unsupported file type is reported as such regardless of whether the path exists. Full core suite: 769 passing. Typecheck and build green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codeant-ai · 2026-06-05T04:00:41Z

CodeAnt AI is reviewing your PR.

codeant-ai · 2026-06-05T04:05:37Z

CodeAnt AI finished reviewing your PR.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/core/src/__tests__/idle-nudge.test.ts (1)
168-175: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add eventListeners.clear() to prevent test pollution.

The eventListeners Set is defined at module scope (Line 80) and accumulates callbacks via onEvent registration, but it's never cleared between tests. This can cause event listeners from one test to leak into subsequent tests.

Other test files with the same pattern (e2e-owner-review.test.ts, e2e-permissions.test.ts, permissions-integration.test.ts) all clear the Set in beforeEach.
🧹 Proposed fix
 beforeEach(() => {
   vi.clearAllMocks();
+  eventListeners.clear();
   db = makeDb();
   runner = new WorkflowRunner({ db, workspaceId: 'ws-test' });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/__tests__/idle-nudge.test.ts` around lines 168 - 175, The
module-level Set eventListeners is accumulating callbacks across tests; add a
call to eventListeners.clear() inside the beforeEach block in this test file so
the Set is reset between tests. Locate the beforeEach that creates db, runner
and mocks (uses waitForExitFn and waitForIdleFn) and insert
eventListeners.clear() at the start of that beforeEach to prevent listener
leakage from prior tests (mirrors the pattern used in e2e-owner-review.test.ts
and others).

🧹 Nitpick comments (1)

packages/core/src/__tests__/verification-custom.test.ts (1)
196-200: ⚡ Quick win

Add eventListeners.clear() to prevent listener accumulation.

The beforeEach hook clears mocks but does not clear the eventListeners Set. Other test files in this PR (verification-traceback.test.ts line 317, workflow-runner.test.ts line 291) clear eventListeners in their beforeEach hooks. While this test may not actively emit events, clearing the set ensures consistency and prevents potential test pollution.
🧹 Suggested cleanup
  beforeEach(() => {
    vi.clearAllMocks();
    queuedSubprocessResults = [];
+   eventListeners.clear();
  });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/__tests__/verification-custom.test.ts` around lines 196 -
200, The beforeEach hook in the 'custom verification' test clears mocks and
resets queuedSubprocessResults but does not clear the eventListeners Set, which
may cause listener accumulation across tests; update the beforeEach block to
call eventListeners.clear() (referencing the eventListeners Set and the
beforeEach function in this test) so event listeners are cleared before each
test run.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@packages/core/src/__tests__/idle-nudge.test.ts`:
- Around line 168-175: The module-level Set eventListeners is accumulating
callbacks across tests; add a call to eventListeners.clear() inside the
beforeEach block in this test file so the Set is reset between tests. Locate the
beforeEach that creates db, runner and mocks (uses waitForExitFn and
waitForIdleFn) and insert eventListeners.clear() at the start of that beforeEach
to prevent listener leakage from prior tests (mirrors the pattern used in
e2e-owner-review.test.ts and others).

---

Nitpick comments:
In `@packages/core/src/__tests__/verification-custom.test.ts`:
- Around line 196-200: The beforeEach hook in the 'custom verification' test
clears mocks and resets queuedSubprocessResults but does not clear the
eventListeners Set, which may cause listener accumulation across tests; update
the beforeEach block to call eventListeners.clear() (referencing the
eventListeners Set and the beforeEach function in this test) so event listeners
are cleared before each test run.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b0e9649e-4533-41fa-ac86-046113eaa0ca

📥 Commits

Reviewing files that changed from the base of the PR and between dda4aaf and 4c32182.

📒 Files selected for processing (22)

packages/core/src/__tests__/budget-enforcement.test.ts
packages/core/src/__tests__/completion-pipeline.test.ts
packages/core/src/__tests__/e2big-and-verify.test.ts
packages/core/src/__tests__/e2e-owner-review.test.ts
packages/core/src/__tests__/e2e-permissions.test.ts
packages/core/src/__tests__/error-scenarios.test.ts
packages/core/src/__tests__/idle-nudge.test.ts
packages/core/src/__tests__/permissions-integration.test.ts
packages/core/src/__tests__/provisioner-audit.test.ts
packages/core/src/__tests__/provisioner-mount.test.ts
packages/core/src/__tests__/resume-fallback.test.ts
packages/core/src/__tests__/run-summary-table.test.ts
packages/core/src/__tests__/start-from.test.ts
packages/core/src/__tests__/step-cwd.test.ts
packages/core/src/__tests__/verification-custom.test.ts
packages/core/src/__tests__/verification-traceback.test.ts
packages/core/src/__tests__/workflow-runner.test.ts
packages/core/src/__tests__/workflow-trajectory.test.ts
packages/core/src/process-spawner.ts
packages/core/src/run-script.ts
packages/core/vitest.config.ts
vitest.config.ts

🚧 Files skipped from review as they are similar to previous changes (2)

packages/core/src/tests/error-scenarios.test.ts
packages/core/src/tests/completion-pipeline.test.ts

cubic-dev-ai

2 issues found across 22 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="vitest.config.ts">

<violation number="1" location="vitest.config.ts:19">
P3: `load('node:sqlite')` is dead code because `resolveId` always marks the module as external.</violation>
</file>

<file name="packages/core/src/process-spawner.ts">

<violation number="1" location="packages/core/src/process-spawner.ts:82">
P1: Completion detection uses reordered stdout/stderr aggregation, which can trigger false early-success termination.</violation>
</file>

_{Reply with feedback, questions, or to request a fix.

Re-trigger cubic}

cubic-dev-ai · 2026-06-05T04:09:45Z


+    const notify = () => {
+      if (onChunk && !settled) {
+        onChunk(`${stdout.join('')}${stderr.join('')}`);


P1: Completion detection uses reordered stdout/stderr aggregation, which can trigger false early-success termination.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/src/process-spawner.ts, line 82: <comment>Completion detection uses reordered stdout/stderr aggregation, which can trigger false early-success termination.</comment> <file context> @@ -68,18 +68,29 @@ export function spawnProcess(command: string[], options: SpawnOptions): ChildPro + const notify = () => { + if (onChunk && !settled) { + onChunk(`${stdout.join('')}${stderr.join('')}`); + } + }; </file context>

cubic-dev-ai · 2026-06-05T04:09:45Z

+    enforce: 'pre',
+    resolveId(id) {
+      if (id === 'node:sqlite' || id === 'sqlite') {
+        return { id: 'node:sqlite', external: true };


P3: load('node:sqlite') is dead code because resolveId always marks the module as external.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At vitest.config.ts, line 19: <comment>`load('node:sqlite')` is dead code because `resolveId` always marks the module as external.</comment> <file context> @@ -1,9 +1,47 @@ + enforce: 'pre', + resolveId(id) { + if (id === 'node:sqlite' || id === 'sqlite') { + return { id: 'node:sqlite', external: true }; + } + return null; </file context>

willwashburn marked this pull request as ready for review June 4, 2026 16:48

chatgpt-codex-connector Bot reviewed Jun 4, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

cubic-dev-ai Bot reviewed Jun 4, 2026

View reviewed changes

codeant-ai Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files label Jun 5, 2026

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

cubic-dev-ai Bot reviewed Jun 5, 2026

View reviewed changes

willwashburn merged commit 4360cd7 into main Jun 5, 2026
2 checks passed

		const { WorkflowRunner } = await import('../runner.js');
		const { workflow } = await import('../builder.js');

		@@ -0,0 +1,181 @@
		# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan

		@@ -0,0 +1,181 @@
		# `@relayflows/core` — `@agent-relay/sdk` v7 → v8 migration plan

		> Status: proposed, not started. This plan was produced from a read-only

	> Status: proposed, not started. This plan was produced from a read-only
	> Status: in progress. WS-0 (utility vendoring) complete; remaining workstreams WS-1 through WS-5 still to be sequenced.

Uh oh!

Conversation

willwashburn commented Jun 4, 2026 • edited by codeant-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

Summary

Primitives

Core SDK migration (v7 → v8)

Tests — ⚠️ in progress (why this is a draft)

Summary by cubic

CodeAnt-AI Description

What Changed

Impact

Checking Your Pull Request

Talking to CodeAnt AI

Example

Preserve Org Learnings with CodeAnt

Example

Retrigger review

Check Your Repository Health

Uh oh!

gemini-code-assist Bot commented Jun 4, 2026

Uh oh!

coderabbitai Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot commented Jun 4, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai Bot commented Jun 5, 2026

Uh oh!

codeant-ai Bot commented Jun 5, 2026

willwashburn commented Jun 4, 2026 •

edited by codeant-ai Bot

Loading

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading

cubic-dev-ai Bot left a comment •

edited

Loading